# Multilingual Visual Reasoning
InternVL3 38B Instruct GGUF
Apache-2.0
InternVL3-38B-Instruct is an advanced Multimodal Large Language Model (MLLM) that demonstrates exceptional overall performance, with strong multimodal perception and reasoning capabilities.
Image-to-Text
Transformers

unsloth
1,236
2
Llama 4 Maverick 17B 128E Instruct
Other
Llama 4 Maverick is a multimodal Mixture of Experts (MoE) model from Meta with 17 billion active parameters and 128 experts. It supports 12 languages and image understanding, and is suitable for commercial and research applications.
Multimodal Fusion
Transformers, Supports Multiple Languages

RedHatAI
29
1
Trillion LLaVA 7B FP16
Apache-2.0
Trillion-LLaVA-7B is a vision-language model with image understanding capabilities. It was trained on English vision-language instruction pairs, yet demonstrates strong cross-lingual visual reasoning abilities.
Text-to-Image
Transformers, Supports Multiple Languages

trillionlabs
14
0
InternVL3 1B AWQ
Other
InternVL3-1B is a multimodal large language model in the InternVL3 series, featuring exceptional multimodal perception and reasoning capabilities.
Text-to-Image
Transformers, Other

OpenGVLab
303
1
InternVL3 1B
Other
InternVL3-1B is a 1B-parameter multimodal large language model in the InternVL3 series. It integrates the InternViT visual encoder with the Qwen2.5 language model and offers exceptional multimodal perception and reasoning capabilities.

FriendliAI
71
1